
2 research outputs found

Detection of deformable objects in a non-stationary scene

Author: Azary Sherif
Publication venue: RIT Scholar Works
Publication date: 01/01/2005

Image registration is the process of determining a mapping between points of interest on separate images to achieve a correspondence. It is fundamental to many problems in computer vision, including object recognition and motion tracking. This research focuses on applying image registration to identify differences between frames in non-stationary video scenes for the purpose of motion tracking. The major stages of the image registration process include point detection, image correspondence, and an affine transformation. After applying image registration to spatially align the image frames and detect areas of motion, segmentation is applied to segment the moving deformable objects in the non-stationary scenes. In this paper, specific techniques are reviewed to implement image registration. First, I will present other work related to image registration for feature point extraction, image correspondence, and spatial transformations. Then I will discuss deformable object recognition. This will be followed by a detailed description of the methods developed for this research and implementation. Included is a discussion of the Harris Corner Detection operator, which identifies key points on separate frames by detecting areas with strong contrasts in intensity values that can be identified as corners. These corners are the feature points that are comparable between frames. Then there will be an explanation of finding point correspondences between two separate video frames using ordinal and orientation measures. When a correspondence is made, the data acquired from the image correspondence calculations will be used to apply translation to align the video frames. With these methods, two frames of video can be properly aligned and then subtracted to detect deformable objects. Finally, areas of motion are segmented using histograms in the HSV color space.
The algorithms are implemented using Intel's open computer vision library called OpenCV. The results demonstrate that this approach is successful at detecting deformable objects in non-stationary scenes.
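
The align-then-subtract step described in this abstract can be illustrated with a minimal sketch. It is not the author's OpenCV implementation: it uses plain Python lists as grayscale frames, assumes the translation (dx, dy) has already been estimated from point correspondences, and the names `shift_frame`, `motion_mask`, and `background` are hypothetical.

```python
def shift_frame(frame, dx, dy):
    """Translate a 2-D grayscale frame by (dx, dy) pixels.

    Pixels with no source value after the shift are set to None.
    """
    h, w = len(frame), len(frame[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = x - dx, y - dy
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = frame[sy][sx]
    return out


def motion_mask(prev, curr, dx, dy, threshold=30):
    """Align prev to curr by translation, then threshold the absolute difference."""
    aligned = shift_frame(prev, dx, dy)
    h, w = len(curr), len(curr[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            a = aligned[y][x]
            # Skip pixels that have no counterpart in the previous frame.
            if a is not None and abs(curr[y][x] - a) > threshold:
                mask[y][x] = 1
    return mask


def background(wx, y):
    """Synthetic textured background (assumed purely for illustration)."""
    return 10 * wx + 3 * y


# The camera pans so that scene content moves one pixel to the right.
prev = [[background(x + 1, y) for x in range(5)] for y in range(5)]
curr = [[background(x, y) for x in range(5)] for y in range(5)]
curr[2][3] = 255  # a moving object appears in the current frame

mask = motion_mask(prev, curr, dx=1, dy=0, threshold=5)
for row in mask:
    print(row)
```

With the correct translation, only the object pixel survives the subtraction. Calling `motion_mask(prev, curr, 0, 0, threshold=5)` on the same frames flags every pixel, which is why the frames are aligned before they are subtracted.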


Grassmann Learning for Recognition and Classification

Author: Azary Sherif
Publication venue: RIT Scholar Works
Publication date: 01/01/2014

Computational performance associated with high-dimensional data is a common challenge for real-world classification and recognition systems. Subspace learning has received considerable attention as a means of finding an efficient low-dimensional representation that leads to better classification and efficient processing. A Grassmann manifold is a space that promotes smooth surfaces, where points represent subspaces and the relationship between points is defined by a mapping of an orthogonal matrix. Grassmann learning involves embedding high-dimensional subspaces and kernelizing the embedding onto a projection space where distance computations can be effectively performed. In this dissertation, Grassmann learning and its benefits towards action classification and face recognition in terms of accuracy and performance are investigated and evaluated. Grassmannian Sparse Representation (GSR) and Grassmannian Spectral Regression (GRASP) are proposed as Grassmann-inspired subspace learning algorithms. GSR is a novel subspace learning algorithm that combines the benefits of Grassmann manifolds with sparse representations using least-squares loss with ℓ1-norm minimization for improved classification. GRASP is a novel subspace learning algorithm that leverages the benefits of Grassmann manifolds and Spectral Regression in a framework that supports high discrimination between classes and achieves computational benefits by using manifold modeling and avoiding eigen-decomposition. The effectiveness of GSR and GRASP is demonstrated for computationally intensive classification problems: (a) multi-view action classification using the IXMAS Multi-View dataset, the i3DPost Multi-View dataset, and the WVU Multi-View dataset, (b) 3D action classification using the MSRAction3D dataset and MSRGesture3D dataset, and (c) face recognition using the ATT Face Database, Labeled Faces in the Wild (LFW), and the Extended Yale Face Database B (YALE).
Additional contributions include the definition of Motion History Surfaces (MHS) and Motion Depth Surfaces (MDS) as descriptors suitable for activity representations in video sequences and 3D depth sequences. An in-depth analysis of Grassmann metrics is applied to high-dimensional data with different levels of noise and data distributions, which reveals that standardized Grassmann kernels are favorable over geodesic metrics on a Grassmann manifold. Finally, an extensive performance analysis is made that supports Grassmann subspace learning as an effective approach for classification and recognition.
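
To make the idea of comparing subspaces concrete, the sketch below computes the projection kernel, one well-known Grassmann kernel, and the projection distance derived from it. Whether this is one of the kernels evaluated in the dissertation is an assumption, and the function names are hypothetical. For orthonormal bases A and B of two p-dimensional subspaces, the kernel is the squared Frobenius norm of AᵀB, and the distance is the square root of p minus that value.

```python
import math


def orthonormal_basis(vectors):
    """Gram-Schmidt: orthonormal row vectors spanning the same subspace."""
    basis = []
    for v in vectors:
        w = [float(c) for c in v]
        for b in basis:
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > 1e-12:  # drop vectors that are linearly dependent
            basis.append([wi / norm for wi in w])
    return basis


def projection_kernel(A, B):
    """k(A, B) = ||A^T B||_F^2 for orthonormal bases stored as rows."""
    return sum(
        sum(ai * bi for ai, bi in zip(a, b)) ** 2
        for a in A
        for b in B
    )


def projection_distance(A, B):
    """sqrt(p - k(A, B)); both subspaces must have the same dimension p."""
    if len(A) != len(B):
        raise ValueError("subspaces must have equal dimension")
    return math.sqrt(max(0.0, len(A) - projection_kernel(A, B)))


# Three 2-D subspaces of R^3, each given by a (not necessarily orthonormal) basis.
xy_plane = orthonormal_basis([[1, 0, 0], [0, 1, 0]])
xy_rotated = orthonormal_basis([[1, 1, 0], [1, -1, 0]])  # same plane, other basis
xz_plane = orthonormal_basis([[1, 0, 0], [0, 0, 1]])

print(projection_kernel(xy_plane, xy_rotated))
print(projection_distance(xy_plane, xz_plane))
```

The kernel depends only on the subspace, not on the basis chosen for it: the two descriptions of the xy-plane have distance zero (up to rounding), while the xy-plane and xz-plane share one direction and are orthogonal in the other, giving distance 1.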
